On Seeking Consensus Between Document Similarity Measures

نویسنده

  • Mieczyslaw A. Klopotek
چکیده

This paper investigates the application of consensus clustering and meta-clustering to the set of all possible partitions of a data set. We show that when using a ”complement” of Rand Index as a measure of cluster similarity, the total-separation partition, putting each element in a separate set, is chosen.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

بررسی قابلیت بهکارگیری سنجه های مرکزیت به عنوان شاخصهای ارتباط استنادی مدارک در بازیابی اطلاعات رابطه ای: مطالعۀ مقدماتی

Purpose: this is a pilot study tends to investigate correlation between centrality measures with bibliographic coupling as a well-known citation-based document similarity measure.  Methodology: using citation analysis method, 40 research articles belonging to four engineering/pure disciplines (Physics, Chemistry, Biology, and computer) and four Humanities and Social disciplines (Economics, Edu...

متن کامل

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

متن کامل

خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی

Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...

متن کامل

SYDE 676 Project Report – Fall 2002 Web Document Clustering Using Phrase-based Document Similarity

Measuring the similarity between documents is an essential operation in text mining, especially document clustering. The traditional method of finding the similarity between documents has always been based on extracting individual words from the documents, and using heuristics to give weights to those features. Standard methods in data mining are then used to find the similarity between documen...

متن کامل

SOME SIMILARITY MEASURES FOR PICTURE FUZZY SETS AND THEIR APPLICATIONS

In this work, we shall present some novel process to measure the similarity between picture fuzzy sets. Firstly, we adopt the concept of intuitionistic fuzzy sets, interval-valued intuitionistic fuzzy sets and picture fuzzy sets. Secondly, we develop some similarity measures between picture fuzzy sets, such as, cosine similarity measure, weighted cosine similarity measure, set-theoretic similar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Fundam. Inform.

دوره 156  شماره 

صفحات  -

تاریخ انتشار 2017